
Add image, file, and dataset attachments for web and CLI #186

Open
aryan5v wants to merge 6 commits into huggingface:main from aryan5v:add-image-file-and-dataset-attachments


aryan5v commented Apr 29, 2026

Summary

Based on #157 and #158.

This PR adds user-selected image, file, and dataset attachments across the web UI and CLI.

The main design choice is to split attachments into two explicit modes:

  • Attach to turn: files/images are used as context for the next agent turn only.
  • Import as dataset: files are uploaded to a private Hugging Face dataset repo and injected into the agent turn as durable dataset manifest context.

What changed

Shared attachment layer

Adds agent/core/attachments.py, which centralizes:

  • filename sanitization
  • file validation
  • MIME/type detection
  • size metadata
  • text previews for readable files
  • local per-turn manifests
  • private HF dataset imports
  • dataset manifest generation
  • model-context notes for attached files
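
As a rough illustration of the first few responsibilities listed above, here is a minimal sketch of filename sanitization and MIME detection using only the standard library. The function names, the allowed character set, and the `MAX_NAME_LEN` limit are assumptions made for this example; they are not taken from agent/core/attachments.py.

```python
import mimetypes
import re
from pathlib import PureWindowsPath

MAX_NAME_LEN = 120  # assumed limit, not taken from the PR


def sanitize_filename(raw: str) -> str:
    """Reduce a user-supplied name to a safe basename (hypothetical helper)."""
    # PureWindowsPath treats both "/" and "\" as separators, so this drops
    # directory components from POSIX-style and Windows-style paths alike.
    name = PureWindowsPath(raw).name
    # Collapse every run of characters outside a conservative allow-list.
    name = re.sub(r"[^A-Za-z0-9._-]+", "_", name)
    # Avoid hidden files and leftover "." / ".." segments.
    name = name.lstrip(".")
    # Keep the tail so the extension survives truncation.
    name = name[-MAX_NAME_LEN:]
    return name or "file"


def guess_mime(name: str) -> str:
    """Guess a MIME type from the extension, with a generic fallback."""
    mime, _encoding = mimetypes.guess_type(name)
    return mime or "application/octet-stream"
```

Extension-based detection is only a hint. Whether the PR also inspects file contents is not stated in the description.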

Web UI

The chat composer now supports:

  • paperclip file picker
  • drag/drop files and images
  • removable attachment chips
  • upload error display
  • Attach to turn vs Import as dataset toggle

When importing as a dataset, files are uploaded to a private dataset repo before the chat turn is submitted.

CLI

Headless mode now supports:

ml-intern "summarize this" --file ./data.csv
ml-intern "inspect this screenshot" --image ./screenshot.png
ml-intern "train on this" --dataset ./train.jsonl

Interactive mode now supports:

/attach ./data.csv ./notes.txt
/dataset ./train.jsonl

--file, --image, and /attach are per-turn context only.
--dataset and /dataset upload to the private HF dataset repo.
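
The flag and slash-command split described above could be wired up roughly as follows. This is a sketch with `argparse` and `shlex`, not the code in agent/main.py: the helper names are hypothetical, and allowing each flag to repeat is an assumption.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Headless-mode parser sketch; flag names come from the PR description."""
    parser = argparse.ArgumentParser(prog="ml-intern")
    parser.add_argument("prompt")
    # Assumption: each flag may be repeated to attach several paths.
    parser.add_argument("--file", action="append", default=[], metavar="PATH")
    parser.add_argument("--image", action="append", default=[], metavar="PATH")
    parser.add_argument("--dataset", action="append", default=[], metavar="PATH")
    return parser


def plan_attachments(args: argparse.Namespace) -> dict:
    """Split parsed paths into the two modes: per-turn context vs. dataset import."""
    return {
        "turn": [*args.file, *args.image],  # never uploaded to the Hub
        "dataset": list(args.dataset),      # uploaded to the private dataset repo
    }


def parse_slash_command(line: str):
    """Map an interactive command to (mode, paths); None if it is not one."""
    parts = shlex.split(line)  # honours quoted paths that contain spaces
    if not parts:
        return None
    mode = {"/attach": "turn", "/dataset": "dataset"}.get(parts[0])
    if mode is None:
        return None
    return mode, parts[1:]
```

Keeping the mode decision in one place means the upload path is only reachable through the explicit dataset flag or command.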

Agent context and privacy

Attached text-like files include metadata and bounded previews in the agent turn.
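
To make "bounded preview" concrete, the sketch below caps how many bytes of a text-like file reach the agent turn and renders a short context note. The 2048-byte limit, the function names, and the note layout are assumptions for illustration only.

```python
PREVIEW_LIMIT = 2048  # assumed cap in bytes, not taken from the PR


def preview_text(data: bytes, limit: int = PREVIEW_LIMIT) -> tuple[str, bool]:
    """Return (preview, truncated) for the first `limit` bytes of `data`."""
    truncated = len(data) > limit
    # errors="replace" keeps decoding safe if the cut lands inside a
    # multi-byte UTF-8 sequence or the file is not valid UTF-8.
    text = data[:limit].decode("utf-8", errors="replace")
    return text, truncated


def preview_file(path: str, limit: int = PREVIEW_LIMIT) -> tuple[str, bool]:
    """Read at most limit + 1 bytes so large files are never loaded whole."""
    with open(path, "rb") as fh:
        return preview_text(fh.read(limit + 1), limit)


def context_note(name: str, mime: str, size: int, preview: str, truncated: bool) -> str:
    """Render the metadata and preview as plain text for the agent turn."""
    marker = " (truncated)" if truncated else ""
    return (
        f"[attached file] {name} | {mime} | {size} bytes\n"
        f"--- preview{marker} ---\n"
        f"{preview}\n"
        f"--- end preview ---"
    )
```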

Images are sent to the LLM transiently as multimodal image parts so it can inspect them. They are then redacted back to placeholders/manifest notes before the turn is written to persisted history or reused by approval continuations.
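
The redaction step might look something like the sketch below. It assumes messages use OpenAI-style content parts (`{"type": "image_url", ...}`); the PR description does not say which message format agent/core/agent_loop.py uses, and `redact_images` is a hypothetical name.

```python
def redact_images(message: dict) -> dict:
    """Return a copy of `message` with image parts swapped for text placeholders.

    Assumes OpenAI-style content parts; the input message is not modified.
    """
    content = message.get("content")
    if not isinstance(content, list):
        return message  # plain-string content carries no image parts

    redacted = []
    image_count = 0
    for part in content:
        if isinstance(part, dict) and part.get("type") == "image_url":
            image_count += 1
            # Drop the raw bytes / data URL; keep only a short marker.
            redacted.append(
                {"type": "text", "text": f"[image #{image_count} omitted from history]"}
            )
        else:
            redacted.append(part)
    return {**message, "content": redacted}
```

Calling this on a turn just before it is persisted would keep the text parts intact while ensuring no base64 payload ends up in stored history.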

Dataset imports are stored under:

{username}/ml-intern-user-datasets
sessions/{session_id_or_cli_run_id}/{upload_id}/...

The agent receives the repo id, path prefix, manifest path, file list, sizes, MIME types, and guidance to use the HF dataset path for training/jobs.

Docs

Updated README with CLI and web usage examples for:

  • --file
  • --image
  • --dataset
  • /attach
  • /dataset

CLI help now also lists /attach and /dataset.

Safety/correctness details

  • Plain attachments do not upload to the Hub.
  • Dataset uploads only happen through explicit dataset commands or the web Import as dataset toggle.
  • Raw image bytes are not persisted in session history.
  • Upload refs are preserved for Claude quota retries.
  • Duplicate multipart filenames are written to unique temporary paths to avoid accidental overwrite.
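
The last point can be illustrated with a short sketch: each multipart part is written under a random prefix, so two uploads named `data.csv` never share a path. The helper names and the `(filename, bytes)` input shape are assumptions, not the code in backend/routes/agent.py.

```python
import os
import tempfile
import uuid


def unique_temp_path(tmp_dir: str, filename: str) -> str:
    """Build a collision-free path that still ends with the original basename."""
    safe = os.path.basename(filename) or "file"
    return os.path.join(tmp_dir, f"{uuid.uuid4().hex}_{safe}")


def save_uploads(uploads, tmp_dir=None):
    """Write (filename, bytes) pairs to disk and return (filename, path) pairs."""
    tmp_dir = tmp_dir or tempfile.mkdtemp(prefix="attachments-")
    saved = []
    for filename, data in uploads:
        path = unique_temp_path(tmp_dir, filename)
        # Mode "xb" raises FileExistsError instead of silently overwriting.
        with open(path, "xb") as fh:
            fh.write(data)
        saved.append((filename, path))
    return saved
```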

Test plan

  • python3 -m py_compile agent/core/attachments.py agent/core/agent_loop.py agent/main.py backend/routes/agent.py backend/session_manager.py
  • uv run pytest tests/unit/test_cli_rendering.py tests/unit/test_session_manager_persistence.py tests/unit/test_attachments.py
  • npm run build

All targeted tests pass. npm run build passes with the existing large chunk warning.

Future improvements

  • Add a richer Codex-style CLI/TUI composer with visible [Image #1] / [File #1] draft chips before submit.
  • Support easier terminal drag/drop flows where available, instead of requiring explicit /attach PATH or --image PATH.
  • Add recursive folder import and larger/resumable uploads if users need bigger dataset ingestion flows.
  • Add direct image-specific UX polish in web, such as thumbnails or preview-on-hover.

Co-authored-by: Cursor <cursoragent@cursor.com>

aryan5v commented May 1, 2026

@lewtun can you help run the Claude PR review?


fglogan commented May 3, 2026

closed per maintainer request


aryan5v commented May 3, 2026

> closed per maintainer request

Don't know what you're talking about and why you're spamming this everywhere.

This PR is ready for review when someone has time. Thanks!

aryan5v added 2 commits May 7, 2026 16:23
…-158

# Conflicts:
#	README.md
#	agent/core/agent_loop.py
#	agent/main.py
#	frontend/src/components/Chat/ChatInput.tsx